68 research outputs found

    Investigating Simple Object Representations in Model-Free Deep Reinforcement Learning

    Get PDF
    We explore the benefits of augmenting state-of-the-art model-free deep reinforcement algorithms with simple object representations. Following the Frostbite challenge posited by Lake et al. (2017), we identify object representations as a critical cognitive capacity lacking from current reinforcement learning agents. We discover that providing the Rainbow model (Hessel et al.,2018) with simple, feature-engineered object representations substantially boosts its performance on the Frostbite game from Atari 2600. We then analyze the relative contributions of the representations of different types of objects, identify environment states where these representations are most impactful, and examine how these representations aid in generalizing to novel situations

    Learning a smooth kernel regularizer for convolutional neural networks

    Get PDF
    Modern deep neural networks require a tremendous amount of data to train, often needing hundreds or thousands of labeled examples to learn an effective representation. For these networks to work with less data, more structure must be built into their architectures or learned from previous experience. The learned weights of convolutional neural networks (CNNs) trained on large datasets for object recognition contain a substantial amount of structure. These representations have parallels to simple cells in the primary visual cortex, where receptive fields are smooth and contain many regularities. Incorporating smoothness constraints over the kernel weights of modern CNN architectures is a promising way to improve their sample complexity. We propose a smooth kernel regularizer that encourages spatial correlations in convolution kernel weights. The correlation parameters of this regularizer are learned from previous experience, yielding a method with a hierarchical Bayesian interpretation. We show that our correlated regularizer can help constrain models for visual recognition, improving over an L2 regularization baseline.Comment: Submitted to CogSci 201

    Learning Inductive Biases with Simple Neural Networks

    Get PDF
    People use rich prior knowledge about the world in order to efficiently learn new concepts. These priors - also known as "inductive biases" - pertain to the space of internal models considered by a learner, and they help the learner make inferences that go beyond the observed data. A recent study found that deep neural networks optimized for object recognition develop the shape bias (Ritter et al., 2017), an inductive bias possessed by children that plays an important role in early word learning. However, these networks use unrealistically large quantities of training data, and the conditions required for these biases to develop are not well understood. Moreover, it is unclear how the learning dynamics of these networks relate to developmental processes in childhood. We investigate the development and influence of the shape bias in neural networks using controlled datasets of abstract patterns and synthetic images, allowing us to systematically vary the quantity and form of the experience provided to the learning algorithms. We find that simple neural networks develop a shape bias after seeing as few as 3 examples of 4 object categories. The development of these biases predicts the onset of vocabulary acceleration in our networks, consistent with the developmental process in children.Comment: Published in Proceedings of the 40th Annual Meeting of the Cognitive Science Society, July 201

    Generating new concepts with hybrid neuro-symbolic models

    Get PDF
    Human conceptual knowledge supports the ability to generate novel yet highly structured concepts, and the form of this conceptual knowledge is of great interest to cognitive scientists. One tradition has emphasized structured knowledge, viewing concepts as embedded in intuitive theories or organized in complex symbolic knowledge structures. A second tradition has emphasized statistical knowledge, viewing conceptual knowledge as an emerging from the rich correlational structure captured by training neural networks and other statistical models. In this paper, we explore a synthesis of these two traditions through a novel neuro-symbolic model for generating new concepts. Using simple visual concepts as a testbed, we bring together neural networks and symbolic probabilistic programs to learn a generative model of novel handwritten characters. Two alternative models are explored with more generic neural network architectures. We compare each of these three models for their likelihoods on held-out character classes and for the quality of their productions, finding that our hybrid model learns the most convincing representation and generalizes further from the training observations.Comment: Published in Proceedings of the 42nd Annual Meeting of the Cognitive Science Society, July 202
    • …
    corecore